Anergy in self-directed B lymphocytes from a statistical mechanics perspective.

The ability of the adaptive immune system to discriminate between self and non-self mainly
stems from the ontogenetic clonal deletion of lymphocytes expressing strong binding affinity with
self-peptides. However, some self-directed lymphocytes may evade selection and still be harmless due
to a mechanism called clonal anergy.

As for B lymphocytes, two major explanations for anergy have developed over three decades: according
to "Varela theory", it stems from a proper orchestration of the whole B-repertoire, in such a way
that self-reactive clones, due to intensive interactions and feedback from other clones, display more
inertia to mount a response. On the other hand, according to the "two-signal model", which
prevails nowadays, self-reacting cells are not stimulated by helper lymphocytes and the absence of
such signaling yields anergy.

The first result we present, achieved through disordered statistical mechanics, shows that helper
cells do not prompt the activation and proliferation of a certain sub-group of B cells, which turn
out to be precisely those that interact broadly; hence it merges the two approaches as a whole (strictly
speaking, Varela theory is then included into the two-signal model, not vice versa).

As a second result, we outline a minimal topological architecture for the B-world, where highly
connected clones are self-directed as a natural consequence of ontogenetic learning; this provides
a mathematical framework for Varela's perspective.

As a consequence of these two achievements, clonal deletion and clonal anergy can also be seen as two
interplaying aspects of the same phenomenon.

PACS numbers: 87.16.Yc, 02.10.Ox, 87.19.xw, 64.60.De, 84.35.+i



I. INTRODUCTION 

The adaptive response of the immune system is per- 
formed through the coordination of a huge ensemble of 
cells (e.g. B cells, helper and regulatory cells, etc.), each 
with specific features, that interact both directly and via
exchanges of chemical messengers such as cytokines and
immunoglobulins (antibodies) [1]. In particular, a key role
is played by B cells, which are lymphocytes characterized 
by membrane-bound immunoglobulin (BCR) working as 
receptors able to specifically bind an antigen; upon acti- 
vation, B cells produce specific soluble immunoglobulin. 
B cells are divided into clones: cells belonging to the 
same clone share the same specificity, that is, they ex- 
press the same BCR and produce the same antibodies 
(somatic hypermutation apart [1]). When an antigen
enters the host body, some of its fragments are presented 
to B cells, then, the clones with the best-matching recep- 
tor, after the authorization of helpers through cytokines, 
undergo clonal expansion and release a huge amount of 
antibodies in order to kill pathogens and restore order. 

This picture, developed by Burnet [2] in the 50's and 
verified across the decades, constitutes the "clonal selec- 
tion theory" and, when focusing on B-cells only, can be 
looked at as a one-body theory [3]: The growth (drop) 
of the antigen concentration elicits (inhibits) the specific 
clones. 



One step forward, in the 70's, Jerne suggested that, 
beyond antigenic stimulation, each antibody must also 
be detected and acted upon by other antibodies; as a re- 
sult, the secretion of an atypically large concentration of 
antibodies by an active B clone (e.g. elicited due to an 
antigen attack) may even prompt the activation of other 
B clones that best match those antibodies [4]. This mechanism,
experimentally well established (see e.g. [5, 6]),
underlies a two-body theory and (possibly) gives rise to 
an effective network of clones interacting via antibodies, 
also known as "idiotypic network". 

The B repertoire is enormous (^ 10^ in humans) and 
continuously updated due to the random gene-reshuffling 
occurring during B-cell ontogenesis in the bone marrow 
[1]. The latter process ensures the diversity of the reper- 
toire and therefore the ability of the immune system 
to recognize many different antigens, but, on the other 
hand, it also inevitably produces cells able to detect and 
attack self-proteins and this possibly constitutes a serious 
danger. In order to avoid the release of such auto-reactive
cells, safety mechanisms are at work during ontogenesis;
yet some of them succeed in escaping through "receptor
editing" (self-reactive cells substitute one of their
receptors on their immunoglobulin surface) [7] or "clonal
anergy" (self-reactive cells that have not been eliminated
or edited in the bone marrow become unresponsive, showing
a reduced expression level of BCR) [8, 9].

In the last decades, two main strands have been proposed
to explain clonal anergy, both supported by experimental
evidence: the former, introduced by Varela
[10-12], allows for B cells only, while the latter, referred
to as the two-signal model [8, 9, 13], allows for both B
and helper T cells.

According to Varela's theory, each clone $\mu$ corresponds
to a node in the idiotypic network, with a (weighted)
coordination number $W_\mu$ (i.e. the sum of the binding
strengths characterizing its possible interactions with all
other clones), which represents a measure of the tolerance
threshold of the clone: clones corresponding to poorly
(highly) connected nodes are easily (hardly) allowed to
respond to the cognate stimulus. In this way the idiotypic
network maintains a regulatory role, where a "core"
of highly (weighted) connected clones acts as a safe-bulk
against self-reactions. Experimental evidence of this
phenomenon has been obtained over the years [10-12];
even so, given the huge size of the B-repertoire, an
extensive experimental exploration has always been out of
reach, so that the initial promising perspectives
offered by the theory were never robustly realized,
and interest in this approach diminished.

Conversely, according to the modern two-signal model, 
the activation of a B-cell (i.e. antibody production and 
clonal expansion of its lineage) requires two signals in 
a given (close) time interval: the first one is delivered 
by the antigen binding to the BCR, the second one is 
provided by a helper T lymphocyte, which elicits the B- 
growth through cytokines [72]. In the absence of the 
second signal, armed clones enter a "safe mode" [7, 14], 
being unable to either proliferate or secrete immunoglobulins.
This explanation for anergy has largely prevailed since,
being based on a local mechanism, its experimental
evidence is beyond doubt; however, it raises the puzzling
question of how self-directed B-cells become "invisible"
to helpers [15] and, moreover, it does not incorporate the previous
findings of Varela's picture, whose experimental evidence
should nonetheless be framed within this prevailing scheme.

The aim of this paper is to answer these questions
through techniques stemming from theoretical physics.
Interestingly, the scenario we outline robustly shows
that highly connected B cells are transparent to helpers,
hence merging the two mechanisms for anergy.



II. METHODS 

In this work we rely on a statistical-mechanics (SM)
modeling of the immune system. Indeed, SM, based
on solid pillars such as the law of large numbers and the
maximum entropy principle [16], aims to figure out
collective phenomena, possibly overlooking the details of
the interactions to focus on the very key features.
Although this certainly implies a certain degree of
simplification, SM, merging thermodynamics [17] and
information theory [18], has been successfully applied to a wide
range of fields, e.g., material sciences [19, 20], sociology
[21, 22], informatics [23], economics [24, 25], artificial
intelligence [26, 27], and systems biology [28, 29]; SM was
also proposed as a candidate instrument for theoretical
immunology in the seminal work by Parisi [30]. Indeed,
the systemic perspective offered by SM nicely fits
emergent properties as collective effects in immunology, as for
instance discussed by Germain: "as one dissects the
immune system at finer and finer levels of resolution, there is
actually a decreasing predictability in the behavior of any
particular unit of function"; furthermore, "no individual
cell requires two signals (...) rather, the probability that
many cells will divide more often is increased by
costimulation" [31]. Understanding this averaged behavior is
precisely the goal of SM.

Moreover, concepts such as "decision making", "learning
process" or "memory" are widespread in immunology
[32-34], and shared by the neural-network branch
[26, 27] of disordered SM [35]: clones, existing as either
active or non-active and being able to interact collectively,
could replace the digital processing units (e.g. flip-flops
in artificial intelligence [36], or neurons in neurobiology
[37]), and cytokines, bringing both eliciting and
suppressive chemical signals, could replace connections (e.g.
cables and inverters in artificial intelligence, or synapses
in neurobiology).

As a last remark, we stress that, as is typical in SM
formalization (see e.g. [27]), we first develop the
simplest scenario, namely we assume symmetry for
the interactions among B and T cells. Although this is
certainly a limitation of the present model, it has the strong
advantage of allowing a clear equilibrium picture that is still
able to capture the phenomenology we focus on; its
off-equilibrium properties (immediately achievable
in the opposite, fully asymmetric, limit) should retain
strong similarities with the present picture and will be
addressed in future investigations.

Having sketched the underlying philosophy of our
work, we highlight our two key results. We first consider
the B-H network and show that helpers are unable to
communicate with highly connected B-cells; then, we
consider the set of B clones and show that a minimal
(biased) learning process, during B-cell clonal deletion
at ontogenesis, can shape the final repertoire such that
highly connected B clones are typically self-directed.
These two points together allow us to merge the two-signal
model and Varela's theory.

The plan of the paper can be summarized by the
following syllogism:

Part I: Anergy induced by T cells and the "two-signal model".

• Fact: The response of B-cells is prompted by two
signals: the presence of an antigen and the "consensus"
by a helper T lymphocyte.

• Consequence: The ensembles made of B and
helper clones interact as a (diluted [67]) bilayer
restricted Boltzmann machine.

• Consequence: This system is (thermodynamically)
equivalent to an associative "neural" network,
whose equilibrium states correspond to optimal
orchestrations of T cells in such a way that a B clone
is maximally signaled and hence prompted to react;
each equilibrium state is univocally related to a B
clone. Remarkably, the activation of B-clones with
high weighted connectivity corresponds to negligible
basins of attraction, hence they are rarely signaled
by helpers.

Part II: Anergy induced by B cells and "Varela theory".

• Fact: Antibodies (as any other protein) are not
random objects (for instance, randomly generated
proteins may not even be able to fold into a stable
structure [38]) [39]. Hence, once expressed through
e.g. bit-strings of information, the related entropy
is not maximal.

• Consequence: In the idiotypic network where B-clones
are nodes and (weighted) links among them
mirror the interactions through the related antibodies,
nodes with higher weighted connectivity are
lazier to react and typically self-directed (Varela
theory).



III. PRELIMINARY REMARKS ON THE
STRUCTURE OF THE B-NETWORK

There are several approaches to estimating the structure,
size and shape of the mature B repertoire. For
instance, in their pioneering works, Jerne and Burnet
used a coarse-grained description in terms of epitopes
and paratopes [2, 4], then Perelson extended (and
symmetrized) them by introducing a shape space [45], De Boer
and coworkers dealt directly with peptides of fixed length
[46], while Bialek, Callan and coworkers recently used the
genetic alphabet made of the VDJ genes coding for
the heavy and light chains of the immunoglobulins [39]
[73].

Proceeding along a general information theory perspective,
we associate to each antibody, labeled as $\mu$, a
binary string $\Psi_\mu$ of length $L$, which effectively carries
information on its structure and on its ability to form
complexes with other antibodies or antigens. Since antibodies
secreted by cells belonging to the same clone share
the same structure, the same string is used to encode
the specificity of the whole related B clone. In this way,
the repertoire will be represented by the set $B$ of properly
generated strings and its cardinality $N_B = |B|$ is the
number of clones present in the system. $L$ must be relatively
short with respect to the repertoire size $N_B$, i.e.
$L = \gamma \ln N_B$, $\gamma \in \mathbb{R}^+$ [3]. This choice stems from both the
probabilistic combinatorial usage of the VDJ recombination
[39] (when thinking of bit-string entries as genes)
and pioneering direct experimental evidence [47] (when
thinking of bit-string entries as epitopes).

Antibodies can bind each other through "lock-and-key"
interactions, that is, interactions are mainly hydrophobic
and electrostatic, and chemical affinities range
over several orders of magnitude [1]. This suggests that
the more complementary two structures are, the more
likely (on an exponential scale) their binding. We therefore
define $\chi$ as a Hamming distance

$$\chi_{\mu\nu} = \sum_{k=1}^{L} \left[ \Psi_\mu^k (1 - \Psi_\nu^k) + \Psi_\nu^k (1 - \Psi_\mu^k) \right], \qquad (1)$$

to measure the complementarity between two bit-strings
$\Psi_\mu$ and $\Psi_\nu$, and introduce a phenomenological coupling
(whose details will be deepened in Sec. IV, see also [3, 48])

$$J_{\mu\nu} \propto e^{\alpha \chi_{\mu\nu}}, \qquad (2)$$

where $\alpha$ tunes the interaction strength. In this way, a
network where nodes are B-clones, and (weighted) links
are given by the coupling matrix $J$, emerges (see Fig. 1,
uppermost panel, and [3, 48-51] for details). This
formalizes Jerne's idiotypic network.

In general, several links may stem from the same
node, say $\mu$, and we define its weighted degree as
$W_\mu = \sum_{\nu=1}^{N_B} J_{\mu\nu}$. When the system is at rest, we can argue that
all B clones are inactive, so that, if clone $\mu$ is stimulated,
$W_\mu$ can be interpreted as the "inertia" of clone $\mu$ to
react, due to all other cells [52]. This mechanism naturally
accounts also for the low-dose phenomenon [1, 3, 52].
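
The construction just described (bit-strings, Hamming complementarity, exponential coupling, weighted degree) can be sketched in a few lines. This is only an illustrative sketch: the unit prefactor in the coupling, the absence of any dilution or self-coupling, and the values of `L`, `n_clones` and `alpha` are assumptions made here, not choices taken from the paper.

```python
import math
import random

def complementarity(psi_mu, psi_nu):
    """Hamming distance between two bit-strings (number of differing entries)."""
    return sum(a * (1 - b) + b * (1 - a) for a, b in zip(psi_mu, psi_nu))

def coupling_matrix(strings, alpha):
    """J[mu][nu] = exp(alpha * chi); unit prefactor and zero diagonal are assumptions."""
    n = len(strings)
    J = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(mu + 1, n):
            chi = complementarity(strings[mu], strings[nu])
            J[mu][nu] = J[nu][mu] = math.exp(alpha * chi)
    return J

def weighted_degrees(J):
    """W[mu] = sum over nu of J[mu][nu]."""
    return [sum(row) for row in J]

# Illustrative parameters (hypothetical, chosen only to keep the example small).
rng = random.Random(0)
L, n_clones, alpha = 12, 6, 0.3
repertoire = [[rng.randint(0, 1) for _ in range(L)] for _ in range(n_clones)]
J = coupling_matrix(repertoire, alpha)
W = weighted_degrees(J)
```

Clones whose strings are complementary to many others accumulate a large `W`, which is the quantity read in the text as the inertia of the clone.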

Finally, it is worth considering how $W$ is distributed, as
this provides information about the occurrence of inertial
nodes in the system. Exploiting the fact that the couplings
$J_{\mu\nu}$ are log-normally distributed [48], one can approximate
the distribution $P(W)$ as

$$P(W) = \frac{1}{W \sigma \sqrt{2\pi}} \exp\left[ -\frac{(\ln W - \mu)^2}{2\sigma^2} \right], \qquad (3)$$

in such a way that mean and variance read as
$E(W) = e^{\mu + \sigma^2/2}$ and $V(W) = (e^{\sigma^2} - 1)\, e^{2\mu + \sigma^2}$, respectively (a detailed
discussion on the parameters $\sigma$ and $\mu$ can be found in
Sec. V and in Appendix Five).
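
As a quick sanity check of the closed-form moments of a log-normal variable quoted above, the following sketch compares them with sample estimates. The parameter values are arbitrary illustrations, not the values fitted in the paper; `random.lognormvariate` is the standard-library log-normal sampler.

```python
import math
import random

def lognormal_moments(mu, sigma):
    """Closed-form mean and variance of a log-normal variable."""
    mean = math.exp(mu + sigma ** 2 / 2)
    var = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)
    return mean, var

def sample_moments(mu, sigma, n_samples, seed=0):
    """Sample mean and unbiased sample variance from n_samples draws."""
    rng = random.Random(seed)
    xs = [rng.lognormvariate(mu, sigma) for _ in range(n_samples)]
    m = sum(xs) / n_samples
    v = sum((x - m) ** 2 for x in xs) / (n_samples - 1)
    return m, v

# Arbitrary illustrative parameters.
exact_mean, exact_var = lognormal_moments(0.0, 0.5)
est_mean, est_var = sample_moments(0.0, 0.5, 20000)
```

The heavy right tail of the distribution is what makes a few very inertial (large-$W$) nodes appear even when typical nodes are weakly connected.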

We stress that the log-normal distribution evidenced
here agrees with experimental findings [63]. Furthermore,
its envelope remains log-normal even if the network is
under-percolated [48]. Thus, in order to have a broad
weighted connectivity, the effective presence of a large,
connected B-network is not a requisite: the mere existence
of small-size components, commonly
seen in experiments [5, 6], is sufficient.

FIG. 1: Schematic representation of the immune networks
considered here, where we fixed $N_H = 30$ and $N_B = 20$. The
uppermost plot describes the B-B network: each B clone $\mu$
corresponds to a different arc, whose length is proportional to
the related weighted degree $W_\mu$, and the interaction between
clones $\mu$ and $\nu$ corresponds to the link connecting the related
arcs, whose thickness is proportional to $J_{\mu\nu}$. The middle
plot describes the bipartite B-H network: the external set of
white circles corresponds to the set of T cells, while the
internal set of colored circles corresponds to the set of B cells,
and their size is proportional to the related weighted degree,
according to the plot above. The interaction $\xi$ between T
cells and B cells can be either excitatory (bright link) or
inhibitory (dark link). The lowermost plot describes the H-H
network: the white circles correspond to the set of T cells
and connections between them are drawn according to the
formula $\sum_\mu \xi_i^\mu \xi_j^\mu / W_\mu$ explained in the text; the color and
the thickness of the link carry information about the sign and
the magnitude of the coupling, respectively.



IV. ANERGY INDUCED BY T CELLS AND 
THE "TWO-SIGNAL MODEL". 

A. Stochastic dynamics for the evolution of clonal size

We denote with $b_\mu \in \mathbb{R}$ the "degree of activation" of
the B clone $\mu$ with respect to a reference value $b_0$, such
that if the clone is in its equilibrium (i.e., at rest) $b_\mu = b_0$,
while if the clone is expanded (suppressed) $b_\mu > b_0$
($b_\mu < b_0$); again, we adopt the simplest assumption of
fixing a unique reference state $b_0 = 0$ for all the clones;
the case of tunable $b_0$ was treated in [64].

Concerning T cells, both helper and regulatory sub-classes
share information with the B branch via cytokines.
Hence, we group them into a unique ensemble
of size $N_H$, and denote the state of each clone by $h_i$
($i = 1, \dots, N_H$); hereafter we call them simply "helpers". We
take $h_i = \pm 1$ such that $h_i = +1$ stands for an active state
(secretion of cytokines) and $h_i = -1$ for a quiescent one; actually,
the choice of binary variables is neither a biological requisite
nor a mathematical constraint, but it allows us to keep
the treatment as simple as possible, while preserving the
qualitative features of the model that we want to highlight.

We define $\epsilon = N_B/N_H$ and, to take advantage of
the central limit theorem (CLT), we focus on the
infinite-volume limit (thermodynamic limit, TDL), such that, as
$N_B \to \infty$ and $N_H \to \infty$, $\epsilon$ is kept constant since,
experimentally, the global amount of helpers and of B-clones is
comparable.

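Recalling that B clones receive two main signals, i.e.
from other B clones and from T ones, we can introduce
the Langevin dynamics for their evolution as

$$\tau \frac{db_\mu}{dt} = -\sum_{\nu=1}^{N_B} J_{\mu\nu} (b_\mu - b_\nu) + \sqrt{\frac{\beta}{N_H}} \sum_{i=1}^{N_H} \xi_i^\mu h_i + \sum_{k} J_{\mu k} A_k + \sqrt{\tau'}\, \eta_\mu, \qquad (4)$$

where $\tau$ rules the characteristic timescale of B cells and
$\tau'$ is the timescale of a white noise $\eta \in \mathcal{N}[0, 1]$. The ratio
between the influence of the noise on the B-H exchanges
and the influence on the B-B interactions is tuned by
$\beta$. The coupling between the $\mu$-th B clone and the $i$-th
T clone is realized by the ensemble of cytokines $\xi_i^\mu$
(see Fig. 1, middle panel) and $A_k$ is a generic antigenic
peptide that interacts with B-clones through the coupling $J_{\mu k}$.

As long as all the interactions are symmetric [74], the
Langevin dynamics admits a Hamiltonian description as

$$\tau \frac{db_\mu}{dt} = -\frac{\partial H}{\partial b_\mu} + \sqrt{\tau'}\, \eta_\mu,$$

where, by integration over $b_\mu$,

$$H = \frac{1}{4} \sum_{\mu,\nu}^{N_B,N_B} J_{\mu\nu} (b_\mu - b_\nu)^2 - \sqrt{\frac{\beta}{N_H}} \sum_{i,\mu}^{N_H,N_B} \xi_i^\mu h_i b_\mu - \sum_{\mu}^{N_B} \sum_{k} J_{\mu k} b_\mu A_k. \qquad (5)$$

Each contribution appearing in the r.h.s. of the previous
equation is detailed in the following: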

• The first term comes from B-clone interactions via
immunoglobulins, which is translated into a diluted
"ferromagnetic" coupling $J_{\mu\nu} > 0$, as B clones tend
to "imitate" one another. Notice that the square
$(b_\mu - b_\nu)^2$ generalizes the ferromagnetic behavior,
typically referred to binary Ising spins, to the case
of "soft spin" variables: the usual two-body term
$b_\mu b_\nu$ [17] is clearly recovered, while the two extra
terms $b_\mu^2$ and $b_\nu^2$ encode a one-body interaction that here
promotes B-cell quiescence in the absence of stimulation.

• The second term represents the coupling between
B and T clones, mediated by cytokines: the cytokine
$\xi_i^\mu$ is meant to connect cells of the $i$-th helper
clone and those of the $\mu$-th B one. The message
conveyed can be either excitatory ($\xi_i^\mu = +1$, e.g.
an eliciting Interleukin-2) or inhibitory ($\xi_i^\mu = -1$,
e.g. a suppressive Interleukin-10) and here is
assumed to be a quenched variable, such that the one
with inhibitory effects can be associated to a
regulatory cell and, vice versa, the one with stimulating
effect to a helper cell. Note that the choice $\pm 1$
for $\xi_i^\mu$ is only a convenient requisite encoding two
opposite effects, while, clearly, their world is by far
richer [41] and, in principle, also mathematically
accessible.

• The third term mimics the interaction of the
generic clone $b_\mu$ with the antigenic peptide $A_k$,
where $J_{\mu k}$ encodes their coupling strength and can
be defined according to eq. 2.

Interestingly, in the Hamiltonian (5), the first term
recovers Jerne's idiotypic network theory, the second one
captures the two-signal model and the third one recovers
Burnet's clonal selection theory: within this SM
framework the three approaches are not conflicting but,
rather, interplaying.

Close to equilibrium, whose investigation is our first
goal, the antigenic load is vanishing ($A_k = 0$ for all $k$)
and the anti-antibodies can consequently be neglected
($b_\mu b_\nu \sim 0$), hence the Langevin process defined in eq. 4
simplifies to

$$\tau \frac{db_\mu}{dt} = -W_\mu b_\mu + \sqrt{\frac{\beta}{N_H}} \sum_{i=1}^{N_H} \xi_i^\mu h_i + \sqrt{\tau'}\, \eta_\mu, \qquad (6)$$

where $W_\mu = \sum_{\nu} J_{\mu\nu}$ is the (weighted) connectivity of
the $\mu$-th node (clone) of the B-network.
Therefore, the Hamiltonian of the process is

$$H_{N_H,N_B}(h, b \mid \xi, W) = \frac{1}{2} \sum_{\mu=1}^{N_B} W_\mu b_\mu^2 - \sqrt{\frac{\beta}{N_H}} \sum_{i,\mu}^{N_H,N_B} \xi_i^\mu h_i b_\mu, \qquad (7)$$

and its properties will be addressed in the next section
through statistical mechanics.
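
To make the role of the weighted connectivity in the simplified relaxation dynamics concrete, the sketch below integrates a single clone with a restoring term $-W b$, a constant helper field and an optional white noise, using the Euler-Maruyama scheme. The function name, the generic `noise` amplitude and all numerical values are assumptions made for illustration; in particular the constant `field` stands in for the helper term, which in the model depends on the helper configuration.

```python
import math
import random

def simulate_clone(W, field, tau=1.0, noise=0.0, dt=0.01, steps=5000, seed=0):
    """Euler-Maruyama integration of tau * db/dt = -W * b + field + noise * eta(t).

    `field` is a constant stand-in for the helper signal and `noise` a generic
    noise amplitude (both are illustrative assumptions).
    """
    rng = random.Random(seed)
    b = 0.0  # clone starts at rest (reference activation b0 = 0)
    for _ in range(steps):
        drift = (-W * b + field) / tau
        b += drift * dt + (noise / tau) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return b

# Same helper field, different weighted connectivity (illustrative values).
b_weakly_connected = simulate_clone(W=2.0, field=1.0)
b_highly_connected = simulate_clone(W=20.0, field=1.0)
```

Without noise the activation relaxes to `field / W`, so the same helper signal produces a ten times smaller response in the clone with ten times larger connectivity.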



B. The equivalence with associative networks 

Once the effective Hamiltonian is defined through eq. 7,
the classical statistical mechanics package can be
introduced; this implies the partition function

$$Z_{N_H,N_B}(\beta \mid \xi, W) = \sum_{\{h\}} \int \prod_{\mu=1}^{N_B} db_\mu \, \exp\left[ -H_{N_H,N_B}(h, b \mid \xi, W) \right], \qquad (8)$$

and the quenched free energy (neglecting constant terms
which do not affect the scenario)

$$A(\beta, \epsilon \mid P(W)) = \lim_{N_H \to \infty} \frac{1}{N_H} \mathbb{E} \ln Z_{N_H,N_B}(\beta \mid \xi, W), \qquad (9)$$

where $\mathbb{E}$ averages over both the $\xi$ and the $W$ distributions.

Notice that the idiotypic contribution in the stochastic
process (6) implicitly generates a Gaussian distribution
for the activity of the B-clones

$$P(b_\mu \mid W) \propto \exp\left( -W_\mu b_\mu^2 / 2 \right), \qquad (10)$$

which is consistent with commonly observed data and
ensures convergence of the integrals in the partition
function (8); interestingly, $W_\mu^{-1}$ plays the role of the variance.
A crucial point is that the integrals over $\{b_\mu\}$ in the
partition function (8) can be calculated explicitly to give

$$Z_{N_H,N_B}(\beta \mid \xi, W) \propto \sum_{\{h\}} \exp\left( \frac{\beta}{2 N_H} \sum_{i,j}^{N_H,N_H} \sum_{\mu=1}^{N_B} \frac{\xi_i^\mu \xi_j^\mu}{W_\mu} h_i h_j \right). \qquad (11)$$

The previous expression deserves attention because it
corresponds to the partition function of a (log-normally
weighted) Hopfield model for neural networks [26] (see
Fig. 1, lowest panel): its Hebbian kernel suggests that
the network of helpers is able to orchestrate strategies
(thought of as patterns of cytokines) if the ratio
$\epsilon = N_B/N_H$ does not exceed a threshold [64], in
agreement with the breakdown of immuno-surveillance occurring
whenever the amount of helpers is too small (e.g. in
HIV infections) or the amount of B cells is too high (e.g. in
strong EBV infections) [75].
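
The weighted Hebbian kernel $\sum_\mu \xi_i^\mu \xi_j^\mu / W_\mu$ that couples helpers after the B-clone activities are integrated out can be built explicitly for a small system. The sketch below also checks numerically that the resulting pairwise energy coincides with the same quantity written through the pattern overlaps $m_\mu = N_H^{-1} \sum_i \xi_i^\mu h_i$. Sizes, seed and the uniform range of `W` are illustrative assumptions; the normalization $1/(2 N_H)$ is the standard Hopfield convention assumed here.

```python
import random

def hebbian_kernel(xi, W):
    """kernel[i][j] = sum over mu of xi[mu][i] * xi[mu][j] / W[mu]."""
    n_h = len(xi[0])
    return [[sum(xi[mu][i] * xi[mu][j] / W[mu] for mu in range(len(xi)))
             for j in range(n_h)] for i in range(n_h)]

def energy_from_kernel(h, kernel):
    """Pairwise form: -(1 / (2 N_H)) * sum_ij kernel[i][j] * h[i] * h[j]."""
    n_h = len(h)
    total = sum(kernel[i][j] * h[i] * h[j] for i in range(n_h) for j in range(n_h))
    return -total / (2 * n_h)

def energy_from_overlaps(h, xi, W):
    """Overlap form: -(N_H / 2) * sum_mu m_mu**2 / W[mu]."""
    n_h = len(h)
    overlaps = [sum(x * s for x, s in zip(pattern, h)) / n_h for pattern in xi]
    return -(n_h / 2) * sum(m * m / w for m, w in zip(overlaps, W))

# Small random instance (illustrative sizes and weights).
rng = random.Random(0)
n_h, n_b = 12, 4
xi = [[rng.choice((-1, 1)) for _ in range(n_h)] for _ in range(n_b)]
W = [rng.uniform(0.5, 5.0) for _ in range(n_b)]
h = [rng.choice((-1, 1)) for _ in range(n_h)]
e_pair = energy_from_kernel(h, hebbian_kernel(xi, W))
e_overlap = energy_from_overlaps(h, xi, W)
```

Each pattern enters the kernel divided by its own $W_\mu$, which is why highly connected clones contribute only weakly to the helper-helper couplings.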

C. High connectivity leads to anergy

As anticipated, the network made of helper cells can
work as a neural network able to retrieve "patterns of
information". There are overall $N_B$ patterns of information
encoded by the cytokine arrangement $\{\xi\}$, and the retrieval of
the pattern $\mu$ means that the state of any arbitrary $i$-th
T clone agrees with the cytokine $\xi_i^\mu$, namely $h_i \xi_i^\mu = +1$;
this ultimately means that clone $\mu$ is maximally stimulated.
A schematic representation of retrieval performed
by T cells and of its consequence on the repertoire of B
cells is depicted in Fig. 2.

FIG. 2: Schematic representation of the consequence of
retrieval capabilities by the helper network in the bipartite
network made of both helpers and B-clones. In the upper
panel a free-energy landscape of the helper network, with four
minima (each corresponding to retrieval of instruction for a
particular B-clone) is shown. The black ball represents the
state of the system, which is driven into the yellow hole (e.g.
due to antigenic stimulation). Consequently, all the helpers
in the bipartite network (lower panel) become parallel to the
sign of the cytokines linking them to the yellow B-clone. This
results in maximal strength conferred to the retrieved clone,
which undergoes clonal expansion. The latter is represented in
the middle plot, together with the lack of growth by the other
(non-retrieved) clones.



Here, with respect to standard Hopfield networks, Hebbian
couplings are softened by the weighted connectivity
$W_\mu$, and this has some deep effects. In fact, the patterns
of information which can be better retrieved (i.e. the
clones which can be more intensively signaled) are those
corresponding to a larger signal, that is, a smaller $W$.
Thus, B-clones with high weighted connectivity (the
safe-bulk) cannot be effectively targeted and, in the TDL,
those B-clones exhibiting $W \to \infty$ are completely
"transparent" to helper signaling.
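
A rough signal-versus-crosstalk estimate illustrates why a large weighted connectivity hampers retrieval. Assuming the weighted Hebbian kernel $\sum_\mu \xi_i^\mu \xi_j^\mu / W_\mu$ (normalized by $N_H$) and helpers placed exactly on pattern $\mu$, the field on helper $i$ splits into a coherent part $\xi_i^\mu / W_\mu$ and a crosstalk part coming from the other patterns. The sketch below measures how many helpers remain aligned with the pattern; sizes, seed and weights are illustrative assumptions, and this zero-noise stability check is a simplification of the full replica analysis.

```python
import random

def local_field_in_pattern(xi, W, mu):
    """Field on each helper when the helper state equals pattern mu.

    field[i] = sum over nu of xi[nu][i] * m_nu / W[nu], where m_nu is the
    overlap between pattern nu and pattern mu (so that m_mu = 1).
    """
    n_h = len(xi[mu])
    overlaps = [sum(a * b for a, b in zip(pattern, xi[mu])) / n_h for pattern in xi]
    return [sum(xi[nu][i] * overlaps[nu] / W[nu] for nu in range(len(xi)))
            for i in range(n_h)]

def aligned_fraction(xi, W, mu):
    """Fraction of helpers whose local field points along pattern mu."""
    field = local_field_in_pattern(xi, W, mu)
    return sum(f * s > 0 for f, s in zip(field, xi[mu])) / len(field)

def signal_strength(W, mu):
    """Coherent part of the field: 1 / W[mu], since pattern mu overlaps itself fully."""
    return 1.0 / W[mu]

# Nine weakly connected clones and one highly connected clone (illustrative).
rng = random.Random(1)
n_h, n_b = 200, 10
xi = [[rng.choice((-1, 1)) for _ in range(n_h)] for _ in range(n_b)]
W = [1.0] * (n_b - 1) + [100.0]
low_w_alignment = aligned_fraction(xi, W, 0)
high_w_alignment = aligned_fraction(xi, W, n_b - 1)
```

For the highly connected clone the coherent signal is a hundred times weaker than for the others, so crosstalk dominates and its pattern is not a stable configuration of the helpers.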

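Deepening this point is now mainly technical. We
introduce the $N_B$ pattern-overlaps $\langle m_\mu \rangle$, which measure
the extent of pattern retrieval, i.e. signaling on clone $\mu$,
and are defined as $\langle m_\mu \rangle = \mathbb{E}\, \omega\left( N_H^{-1} \sum_{i=1}^{N_H} \xi_i^\mu h_i \right)$, where $\omega$
is the standard Boltzmann state [17] associated to the
free energy (9), which allow us to rewrite the Hamiltonian
corresponding to Eq. (11) as

$$H_{N_H}(h \mid \xi, W) = -\frac{N_H}{2} \sum_{\mu=1}^{N_B} \frac{m_\mu^2}{W_\mu}, \qquad m_\mu = \frac{1}{N_H} \sum_{i=1}^{N_H} \xi_i^\mu h_i. \qquad (12)$$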

Now, free energy minimization implies that the system
spontaneously tries to reach a retrieval state where
$\langle m_\mu \rangle \to 1$ for some $\mu$. Of course, this is more likely for
clones $\mu$ with smaller $W_\mu$, while highly connected ones
are expected not to be signaled (pathological cases apart,
i.e. the no-noise limit $\beta \to \infty$ or the giant clonal expansion
limit $b_0 \to \infty$).

Note that $\langle m_\mu \rangle = 1$ (gauge invariance apart) means
that all the helpers belonging to the clone $i$ are parallel
to their corresponding cytokine: if $\xi_i^\mu = +1$ is an eliciting
messenger, the corresponding helper $h_i$ will be firing;
vice versa, for $\xi_j^\mu = -1$ the corresponding helper $h_j$ will be
quiescent, so as to confer to the clone $b_\mu$ the maximal
expansion field.

In order to figure out the concrete existence of this
retrieval, we solved the model through the standard replica
trick [35], at the replica symmetric level (see Appendix
One), and integrated numerically the obtained
self-consistency equations, which read as

$$\langle m_1(\epsilon, \beta) \rangle = \left\langle \left\langle \xi^1 \tanh\left[ \beta \left( m_1 \xi^1 / W_1 + \sqrt{\epsilon r}\, z \right) \right] \right\rangle_z \right\rangle_{\xi, W},$$

$$\langle q(\epsilon, \beta) \rangle = \left\langle \left\langle \tanh^2\left[ \beta \left( m_1 \xi^1 / W_1 + \sqrt{\epsilon r}\, z \right) \right] \right\rangle_z \right\rangle_{\xi, W},$$

$$\langle r(\epsilon, \beta) \rangle = \lim_{N_H \to \infty} \frac{1}{\epsilon N_H} \sum_{\mu > 1}^{N_B} \frac{q}{\left[ W_\mu - \beta (1 - q) \right]^2}, \qquad (13)$$

where $z$ is a standard Gaussian variable.
In this set of equations, we used the label 1 to denote
a test B-clone $\mu = 1$, which can be either a self node
(i.e. with a high value of $W_1$, infinite in the TDL) or a
non-self one (i.e. with a small value of $W_1$, zero in the
TDL). While the first equation defines the capability of
retrieval by the immune network as explained earlier, $q$
is the Edwards-Anderson spin-glass order parameter [35]
and $r$ accounts for the slow noise in the network due both
to the number of stored strategies and to the weighted
connectivity [76].

As shown in Appendix Two, the equations above
can be solved in complete generality. Here, for simplicity,
we describe the outcome obtained by replacing all
$W_\mu$ with $\mu \neq 1$ (as $\mu = 1$ is the test case) with their
average behavior, namely $\langle W \rangle = \int dW\, P(W)\, W$; this
assumption makes the evaluation of the order parameter $r$
much easier, while preserving the qualitative outcome.

We now focus on the two limiting cases: $W_1 \ll \langle W \rangle$,
which accounts for a non-self node, and $W_1 \gg \langle W \rangle$,
which mirrors the self counterpart.

In the former case, the slow noise is small (vanishing as
$\langle W \rangle \to \infty$); consequently, the non-self nodes live in a free
environment and the corresponding equations for their
retrieval collapse to those of the non-saturated Hopfield model
[26]. Hence, retrieval should always be possible (ergodic
limit apart); therefore, in this case, helpers can effectively
signal clone 1.

FIG. 3: Schematic representation of the (free-energy) basins of attraction for a toy system starting (at left) with four minima
(hence four retrievable patterns). Each minimum contains information addressed to the corresponding B-clone, so that four
B-clones $B_1$, $B_2$, $B_3$, $B_4$ can be instructed in the initial configuration. From left to right we always keep $W_2 = W_4 = 1$, while we
progressively increase $W_1 = W_3 = 1, 5, 10, 100$ (and we show the resulting basins of attraction from left to right). Note that at the
value of the weighted connectivity $W_1 = W_3 = 100$ the corresponding minima completely disappear, hence instructions to the
corresponding B-clones (which are broadly interacting, as their $W$ is much higher than $W_2 = W_4 = 1$) cannot be supplied by
helpers.

Conversely, in the latter case, namely dealing with a
self node, it is straightforward to check that the noise
rescaling due to $W$ implies a critical noise level for the
retrieval $\beta_c^{-1} \sim W_1^{-1} \to 0$ (as $W_1$ is ideally diverging in
the thermodynamic limit, see Fig. 5 and Sec. V). As a
result, under normal conditions, the retrieval of patterns
enhancing self-node clonal expansions is never performed
by helpers: this behavior mimics anergy as a natural
emergent property of these networks.

As a further numerical check, we performed Monte Carlo
simulations, which are in agreement with these findings.
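
The scaling of the critical noise with the weighted connectivity can be illustrated in the simplest setting. Assuming the low-load limit, where the slow-noise term is dropped, and binary cytokines $\xi = \pm 1$, the retrieval equation for the test clone reduces to $m = \tanh(\beta m / W_1)$; this reduction is an assumption of the sketch, not the full replica-symmetric system. A non-zero solution exists only for $\beta / W_1 > 1$, which can be checked by fixed-point iteration.

```python
import math

def retrieval_overlap(beta, W1, m0=1.0, iterations=10000):
    """Fixed-point iteration of m = tanh(beta * m / W1), starting from m0."""
    m = m0
    for _ in range(iterations):
        m = math.tanh(beta * m / W1)
    return m

# Same noise level, weakly versus highly connected test clone (illustrative values).
m_non_self = retrieval_overlap(beta=2.0, W1=1.0)
m_self = retrieval_overlap(beta=2.0, W1=100.0)
```

At fixed noise, the weakly connected clone is retrieved with an overlap close to one, while for the highly connected clone the iteration collapses to zero, i.e. the helpers do not signal it.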



V. ANERGY INDUCED BY B CELLS AND 
"VARELA THEORY". 

So far we have shown that helper cells are unable to
exchange signals with highly connected B-clones; however,
the reason why the latter should be self-directed is still
puzzling. Now, we build a basic model for the ontogenetic
process of B cells, which solely assumes that self
proteins are not random objects, and we show that the
surviving clones expressing large self-avidity are those that
are highly connected.



A. Ontogenesis and the emergence of a biased 
repertoire. 

During ontogenesis in the bone marrow, B-cell survival
requires sufficiently strong binding to at least one self
molecule (positive selection), but those cells which bind
too strongly are deleted as well (negative selection): such
conditions ensure that surviving B cells are neither
aberrant nor potentially harmful to the host [53, 54].

To simulate this process, we model the ensemble of
self-molecules as a set $S$ of strings $\Phi$ of length $L$, whose
entries are extracted independently via a proper
distribution. The overall number of self-molecules is $|S| = N_S$,
that is, the strings in $S$ carry a label running over $1, \dots, N_S$.

As stated in the introduction, although a certain degree
of stochasticity seems to be present even in biological
systems, proteins are clearly not completely random objects
[38]: indeed, the estimated size of the set of self-proteins
is much smaller than the one expected from randomly
generated sets [39]. Within an information-theory
context, this means that the entropy of such a repertoire is not
maximal, that is, within the set $S$ some self-proteins are
more likely than others (see Appendix Three).

In order to account for this feature, we generate $S$
extracting each string entry $\Phi_i$ according to the simplest
biased distribution

$$P_{\mathrm{self}}(\Phi_i \mid a) = \frac{1 + a}{2}\, \delta(\Phi_i - 1) + \frac{1 - a}{2}\, \delta(\Phi_i), \qquad (14)$$

where $\delta(x)$ is the Kronecker delta and $a \in [-1, 1]$ is a
parameter tuning the degree of bias, i.e. the likelihood
of repetitions among string-bits. Of course, when $a = 0$
the completely random scenario is recovered. We stress
that here, looking for minimal requisites, we neglect
correlations among string entries [39], in favor of a simple
mean-field approach where entries are identically and
independently generated.
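
The statement that a biased repertoire has non-maximal entropy can be made quantitative for independent bits. The sketch assumes that the bias $a$ sets the two bit probabilities to $(1 + a)/2$ and $(1 - a)/2$ (which of the two values is favored is a convention adopted here) and evaluates the Shannon entropy per string entry.

```python
import math

def bit_probabilities(a):
    """Probabilities of the two bit values for bias a in [-1, 1] (assumed convention)."""
    return (1 + a) / 2, (1 - a) / 2

def entropy_per_bit(a):
    """Shannon entropy (in bits) of a single biased string entry."""
    return -sum(p * math.log2(p) for p in bit_probabilities(a) if p > 0)

def repertoire_entropy(a, L):
    """Entropy of a string of L independent, identically biased entries."""
    return L * entropy_per_bit(a)
```

The entropy is maximal (one bit per entry) only at $a = 0$ and decreases monotonically as $|a|$ grows, which is the sense in which a biased set of self-strings is "less random".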

As underlined above, a newborn B cell, represented by
an arbitrary string $\Psi$, undergoes a screening process and
the condition for survival can be restated as

$$\chi_P < \max_{\Phi \in S} \{ \chi(\Psi, \Phi) \} < \chi_N, \qquad (15)$$

$\chi_P$ and $\chi_N$ being the thresholds corresponding to
positive and negative selection, respectively.

As explained in Appendix Four, the value of the parameters
$\chi_P$ and $\chi_N$ can be fixed according to indirect
measurements, such as the survival probability of newborn
B cells: it is widely accepted that human bone marrow
produces daily $\sim 10^7$ B cells, but only $\sim 10^6$ are
allowed to circulate in the body, while the remaining 90%
undergo apoptosis since they are targeted as self-reactive
[57-60]; therefore the expected survival probability for a
newborn B cell is $P_{\mathrm{surv}} = 0.1$ (see Fig. 4, left panel).
FIG. 4: Plots from simulations where we generated random
strings $\Psi$ and compared them with those in $S$, generated
according to the distribution in Eq. 36. Strings $\Psi$
fulfilling the condition 15 are retained, and the survival
probability $P_{\mathrm{surv}}$ is measured and plotted versus $a$
(left panel). The final repertoire $B$ turns out to be biased
as well, with a degree $\alpha$ depending on $a$ (central panel).
Moreover, we measured the Spearman correlation coefficient
$\rho$, averaged over $B$, between $W_\mu$ and
$\max_{\Phi \in S}\{\chi(\Psi_\mu, \Phi)\}$ (right panel): notice that a
positive value denotes the existence of correlation and gives
strong numerical evidence of Varela theory. Data represented
in these plots refer to a system where we fixed the size of
the B-repertoire $N_B = 10^{\ldots}$ and $\gamma = 2$, $c = 0.5$,
$\Delta = 0.4$, $\chi_P = 0.6L$ (see Appendix Four for more
details); data were averaged over $10^{\ldots}$ realizations.



Thus, we extract randomly and independently a string
$\Psi$ and we check whether Eq. 15 is fulfilled; if so, the
string is selected to make up the repertoire $B$. We proceed
sequentially in this way until the prescribed size $N_B$ is
attained (see Appendix Four for more details).
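The sequential selection procedure can be sketched as follows. This is only an illustration under explicit assumptions: strings have entries in {+1, -1}, the complementarity between two strings is taken as the number of positions carrying opposite entries, newborn strings are drawn without bias, and all names and sizes (`build_repertoire`, 20 self strings of length 40, thresholds 0.6L and 0.8L) are hypothetical.

```python
import random

def sample_string(length, bias, rng):
    """i.i.d. entries in {+1, -1} with P(+1) = (1 + bias) / 2."""
    p_plus = (1.0 + bias) / 2.0
    return [1 if rng.random() < p_plus else -1 for _ in range(length)]

def complementarity(psi, phi):
    """Assumed affinity: number of positions with opposite entries."""
    return sum(1 for x, y in zip(psi, phi) if x != y)

def survives(psi, self_set, chi_p, chi_n):
    """Survival condition: chi_p < max over self strings of chi < chi_n."""
    best = max(complementarity(psi, phi) for phi in self_set)
    return chi_p < best < chi_n

def build_repertoire(size, self_set, length, chi_p, chi_n, rng,
                     max_trials=100000):
    """Draw newborn strings one at a time and keep the survivors."""
    repertoire, trials = [], 0
    while len(repertoire) < size and trials < max_trials:
        trials += 1
        psi = sample_string(length, 0.0, rng)  # unbiased newborn (assumption)
        if survives(psi, self_set, chi_p, chi_n):
            repertoire.append(psi)
    return repertoire, trials

rng = random.Random(1)
L = 40
self_set = [sample_string(L, 0.5, rng) for _ in range(20)]
repertoire, trials = build_repertoire(50, self_set, L, 0.6 * L, 0.8 * L, rng)
print(len(repertoire), trials, len(repertoire) / trials)
```

The last printed number is the measured survival fraction, the quantity to be compared with the expected survival probability when tuning the thresholds.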

The final repertoire is then analyzed, finding that the
occurrence of string entries is not completely random,
but is compatible with a biased distribution such as

$$P_{\mathrm{rep}}(\Psi_\mu^i \mid \alpha) = \delta(\Psi_\mu^i - 1)\,\frac{1+\alpha}{2} + \delta(\Psi_\mu^i + 1)\,\frac{1-\alpha}{2}, \qquad (16)$$

where $\alpha$ turns out to be correlated with $a$. More precisely,
positive values of $a$ yield a biased mature repertoire with
$\alpha > 0$ (see Fig. 4, central panel). Consequently, in the
set $B$ generated in this way, nodes with large $W_\mu$, and
therefore dissimilar with respect to the average string,
are likely to display large affinity with the self repertoire.
To corroborate this fact we measured the correlation $\rho$
between the weighted degree $W_\mu$ of a node and its affinity
$\max_{\Phi \in S}\{\chi(\Psi_\mu, \Phi)\}$ with the self-repertoire,
finding a positive correlation (see Fig. 4, right panel). We
also checked the response of the B-repertoire when antigens
are presented, finding that, when a string $\Phi_\mu \in S$ is
taken as antigen, the best-matching node, displaying large $W$,
needs an (exponentially) stronger signal on its BCR in order
to react.
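A rank correlation of the kind used here can be computed as in the following sketch, which implements the Spearman coefficient as the Pearson correlation of average ranks. The function names and the toy data are hypothetical; the input lists are assumed to be non-constant, otherwise the coefficient is undefined.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (assumption): weighted degrees and self-affinities of five clones.
degrees = [0.2, 1.5, 0.9, 3.1, 2.2]
affinities = [21, 25, 24, 30, 26]
print(spearman(degrees, affinities))
```

Being rank based, the coefficient is insensitive to the broad, log-normal-like spread of the weighted degrees, which is a reason to prefer it over a linear correlation in this setting.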

Such results mirror Varela's theory [11, 12], according 
to which "self-directed" nodes display a high (weighted) 
connectivity, which, in turn, induces inhibition. 

FIG. 5: In the upper part of this figure we show a phase
diagram concerning the distribution of the interaction
strengths of the idiotypic network. More precisely, with
$\gamma = 2$ fixed, we highlight different regions of the
$(a, \alpha)$ plane, each corresponding to a different behavior of
the average coupling $E(J) = \langle J \rangle_a$ and of the variance
$V(J) = \langle J^2 \rangle_a - \langle J \rangle_a^2$, as explained by the legend.
Different behaviors of $E(J)$ and $V(J)$ can be recast into
different topological regimes, as envisaged by the graphs
depicted in the lower part of the figure, which represent
particular realizations of the idiotypic network and refer to
the particular choice $\alpha = 0.7$, $N_B = 10^{\ldots}$ and to
different values of $a$ (see also [48]). We underline that
different regions imply different thermodynamic regimes which
can be associated to different immunological capabilities.

Finally, it is worth underlining that, by taking a biased
distribution for string entries (i.e., $a \neq 0$), the
distribution $P(W)$ for weights occurring in the idiotypic
network still retains its log-normal shape, namely

$$P(W) = \frac{1}{W \sqrt{2\pi}\,\sigma} \exp\left[-\frac{(\log W - \mu)^2}{2\sigma^2}\right], \qquad (17)$$

with

$$\mu = \log\left[\frac{N_B \langle J \rangle_a^2}{\sqrt{\langle J \rangle_a^2 + (\langle J^2 \rangle_a - \langle J \rangle_a^2)/N_B}}\right], \qquad (18)$$

$$\sigma^2 = \log\left[\frac{\langle J^2 \rangle_a - \langle J \rangle_a^2 + N_B \langle J \rangle_a^2}{N_B \langle J \rangle_a^2}\right], \qquad (19)$$



where $\langle J \rangle_a$ and $\langle J^2 \rangle_a$ are, respectively, the mean
value and the mean squared value of the coupling $J_{\mu\nu}$
defined in Eq. (2). A detailed derivation of these values can
be found in Appendix Five, while here we simply notice that,
by properly tuning $a$ and $\alpha$, one can recover, in the
thermodynamic limit, different regimes characterized by
different behaviors (finite, vanishing or diverging) for the
average $E(J) = \langle J \rangle_a$ and the variance
$V(J) = \langle J^2 \rangle_a - \langle J \rangle_a^2$, respectively, as reported
in Fig. 5.
VI. CONCLUSIONS AND OUTLOOKS

In this paper we tried a systemic approach for modeling
a subset of the adaptive response of the immune system
by means of statistical mechanics. We focused on the
emergent properties of the interacting lymphocytes
starting from minimal assumptions on their local
exchanges and, as a fine test, we searched for the
emergence of subtle features such as the anergy shown by
self-directed B cells.

First, we reviewed and framed into a statistical mechanics
description the two main strands for its explanation, i.e.
the two-signal model and the idiotypic network. To this end
we described the mutual interaction between B cells and
(helper and suppressor) T cells as a bipartite spin glass,
and we showed its thermodynamical equivalence to an
associative network made of T cells (helpers and
suppressors) alone. Then, the latter have been shown to
properly orchestrate the response of B cells as long as their
connection within the bulk of the idiotypic network is
rather small. In the second part we adopted an information
theory perspective to infer that highly connected B clones
are typically self-directed as a natural consequence of
learning during ontogenesis.

By merging these results we get that helpers are always
able to signal non-self B lymphocytes, in such a way that 
the latter can activate, proliferate and produce antibodies 
to fight against non-self antigens. On the other hand, self 
lymphocytes, due to their large connectivity within the 
idiotypic network, do not feel the signal sent by helpers. 

Therefore, a robust and unified framework where the 
two approaches act synergically is achieved. Interest- 
ingly, this picture ultimately stems from a biased learning 
process at ontogenesis and offers, as a sideline, even a the- 
oretical backbone to Varela theory. We stress that, while 
certainly the Jerne interactions among B cells act as a 
key ingredient (and the existence of anti-antibodies or 
small reticular motifs has been largely documented), an 
over-percolated B network is not actually required as the 
distribution of the weighted clonal connectivity remains 
broad even for extremely diluted regimes. 

Furthermore, we note that, within our approach, while
Varela theory is reabsorbed into the two-signal model,
the converse is not true: clearly other cells (beyond the
highly connected ones in the B-repertoire) may, through
other paths, lack helper signalling and become anergic,
hence the two-signal model is not necessarily reabsorbed
into Varela theory.

Moreover, the model developed is able to reproduce
several other aspects of real immune networks, such as the
breakdown of immuno-surveillance by unbalancing the
leukocyte formula, the low-dose tolerance phenomenon,
the link between lymphocytosis and autoimmunity (as
for instance well documented in the case of A.L.P.S. [64])
and the capability of the system to cope simultaneously
with several antigens [66, 67].

Despite these achievements, several assumptions underlying
this minimal model could be relaxed or improved
in future developments, ranging from the symmetry of
the interactions to the fully connected topology of the
B-H interactions.



VII. APPENDICES 

A. Appendix One: The replica trick calculation for 
the free energy 

In this section we want to figure out the expression of
the free energy relative to the partition function (eq. 11)
of a weighted Hopfield model near saturation (for values
$\epsilon \neq 0$) whose weights are drawn according to $P(W)$. Its
derivation is obtained using the "replica trick", namely

$$\log Z = \lim_{n \to 0} \frac{Z^n - 1}{n},$$

within the replica symmetric approximation [35].
Through the latter, the free energy $A(\beta, \epsilon)$ (hereafter
simply $A$ for the sake of simplicity) can be written as

1 " 
A= lim lim-— log( y exp{-^y^(/i»,0})| 

h^,...,h" a=l 

where we introduced the symbol $a \in (1, \ldots, n)$ to label
the different replicas and $\langle \cdot \rangle_\xi$ indicates a quenched
average over the patterns $\xi$. The replicated partition
function averaged over the patterns $\xi$ hence reads as

{h} " IJ.,a 

(20) 

which is equivalent to eq. 11. Now, without loss of
generality, we suppose to retrieve a number $s$ of memorized
patterns and we divide the sum over the $N_B$ patterns
into two sets: the former refers to the retrieved
patterns (labeled with the index $\mu = 1, \ldots, s$) while
the latter refers to the non-retrieved ones (labeled with
$\mu = s + 1, \ldots, N_B$).

The sum over the retrieved patterns can be manipulated by
introducing $n \times s$ Gaussian variables in order to
linearize the quadratic term in the exponent




On the other side, the term corresponding to non-retrieved
patterns can be written, after some computations
including averaging over $\xi$, as
J 2N



-, Nb 

= exp{-- ^ Trln[i^(VK^)]}, 



m and r allows to get the self-consistent equations for m 
and q respectively. Conversely, by extremizing 24 with 
respect to q, one gets the self-consistent equation for r as 



jU=S+l 



(22) 

where = ^a6-(/3/^ff) Ef" The previous 

expression motivates the introduction of the family of 
n(n — 1) order parameters qab = ^■iid their 

conjugates qab through the identity 



') 



(23) 



= I {Yldq.bdqab)e''i''^^''"'^^^-^'^"^^'^\ 

a,b 



Putting all together and omitting negligible terms in Nh, 
we get 

A{l3,e)= lim lim — log / dm( TT rf^oferfgab) 
exp I^Nh 

n 



a,b 



1- ^ Tr In [K{W, {qab})] 

fJ, = S 

s.n 



2Nh 

P ^ i'K) 



fi = S+l 



n NB,n ^ -1 



In the last expression, the leading dependence on
the system size $N_H$ is in the global factor in the
exponent, hence we can obtain the replicated free energy
using the saddle-point method, i.e. extremizing the function
in the exponent. Under the replica-symmetry assumption we
get



A{m,q,r\p,t) 



lim 



Nh ^oc 2Nh 



Nb 



(1 - (q),) + (In 2 cosh [/? {j2wS + ^i^^)] ^ 

1^ = 1 

(24) 



where $\langle \cdot \rangle_z$ indicates the average over the Gaussian measure
$d\mu(z) \propto \exp(-z^2/2)\,dz$. We then obtain the self-consistent
equations reported in the main text by extremizing
$A(m, q, r \mid \beta, \epsilon)$ with respect to $m$, $q$, $r$.



B. Appendix Two: Quenched evaluation of the 
slow noise order parameter r 

As we hinted in the main text and in the previous 
section, extremizing the free energy 24 with respect to 



(r(e,/?))= lim 



1 



Nb 



(25) 



In the TDL, the last expression can be rewritten through 
q 1 _(iogw-^ 



I 



- ^(1 - g))2 Ws/2ia 



(26) 

where we use eq. 50, and $\mu$ and $\sigma$ are given by eqs. 48
and 49.

A more intuitive route (resembling annealing in spin
glasses [35], but ultimately leading to qualitatively
correct results) consists in substituting in eq. 25 all the
$W_\mu$ different from $W_1$ ($\mu = 1$ is the test case) with the
mean value $\langle W \rangle$. Explicitly,



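$$\langle W \rangle = N_B \langle J \rangle = N_B \exp[\langle \chi \rangle_a (\alpha + 1) L - L] = N_B^{\gamma\theta - \gamma + 1},$$

with $\theta = \langle \chi \rangle_a (\alpha + 1)$. Being $L = \gamma \ln N_B$ [3],
in the TDL three regimes survive:

$$\langle W \rangle \sim N_B^{\gamma\theta - \gamma + 1} \to \begin{cases} \infty, & \text{if } \theta > 1 - \frac{1}{\gamma}, \\ 1, & \text{if } \theta = 1 - \frac{1}{\gamma}, \\ 0, & \text{if } \theta < 1 - \frac{1}{\gamma}. \end{cases} \qquad (27)$$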
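The three regimes can be checked numerically with a short sketch. It assumes the annealed estimate above, with L = γ ln N_B and θ standing for the combination ⟨χ⟩(α + 1); the function names and the numerical values are hypothetical.

```python
import math

def mean_degree_exponent(theta, gamma):
    """Exponent e such that <W> ~ N_B**e, i.e. e = gamma*theta - gamma + 1."""
    return gamma * theta - gamma + 1.0

def regime(theta, gamma, tol=1e-12):
    """Large-N_B behaviour of <W>: 'diverging', 'finite' or 'vanishing'."""
    e = mean_degree_exponent(theta, gamma)
    if e > tol:
        return "diverging"
    if e < -tol:
        return "vanishing"
    return "finite"

def annealed_mean_degree(theta, gamma, n_b):
    """N_B * exp(theta*L - L), evaluated with L = gamma * ln(N_B)."""
    length = gamma * math.log(n_b)
    return n_b * math.exp(theta * length - length)

for theta in (0.3, 0.5, 0.6):
    print(theta, regime(theta, 2.0), annealed_mean_degree(theta, 2.0, 10**4))
```

For γ = 2 the threshold sits at θ = 1 - 1/γ = 0.5, so the three values of θ in the loop fall in the vanishing, finite and diverging regimes respectively.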



So, when $\langle W \rangle \to \infty$, we can think of the test clone $B_1$
as non-self-directed because its connectivity is smaller than
that of the other ones, while when $\langle W \rangle \to 0$ we can
think of the test clone $B_1$ as self-directed, its
connectivity being higher than that of the others.
Accordingly, $\langle r \rangle$ can assume three different values:



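$$\langle r \rangle = \begin{cases} 0, & \text{if } \langle W \rangle \to \infty, \\ \dfrac{q}{(1 - \beta(1-q))^2}, & \text{if } \langle W \rangle \to 1, \\ \dfrac{q}{\beta^2 (1-q)^2}, & \text{if } \langle W \rangle \to 0. \end{cases} \qquad (28)$$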



Therefore we can discuss the following three situations:

1. The typical B clone displays $\langle W \rangle \to \infty$, namely, it
is more connected than the test clone $B_1$. Thus, $B_1$
can be interpreted as a non-self clone [11]. In this
case $r$ is vanishing and the self-consistent equation
for $m$ is simply



(29) 



which is the self-consistent equation for a Hopfield
model away from saturation [26? ] with a
rescaled noise level $\beta' = \beta / W_1$. From an immunological
point of view, this means that helpers can
successfully exchange signals with the clone $B_1$
under antigenic stimulation.

2. The case $\langle W \rangle \to 1$ has zero probability measure; it
recovers the Hopfield neural model near saturation
[26, 70], and can be skipped.
3. The typical B clone displays $\langle W \rangle \to 0$. Hence,
we can interpret $B_1$ as self-addressed. The
self-consistent equations in this case are



((^Hanh ^( 



q = ((tanh^ /3 



Pil-q) 



(30) 



where we substituted the expression for $r$ (third
equation in 28) directly into the equations for $m$
and $q$.

As a result, $B_1$, being highly connected, cannot feel
helper signaling and therefore remains anergic.



and the occurring frequency of a single entry. Recall
that each antibody is represented as a binary
string of length $L$ ($\Psi^i \in \{-1, 1\}$, $i = 1, \ldots, L$) whose
entries are independent and identically distributed,
following $P(\Psi) = \prod_{i=1}^{L} P(\Psi^i)$. Each probability
distribution of a dichotomic random variable can be written
following Eq. (37),

$$P_{\mathrm{rep}}(\Psi_\mu^i \mid \alpha) = \delta(\Psi_\mu^i - 1)\,\frac{1+\alpha}{2} + \delta(\Psi_\mu^i + 1)\,\frac{1-\alpha}{2}, \qquad (31)$$

where $\delta(x)$ is the Kronecker delta and $\alpha \in [-1, +1]$ tunes
the extent of bias, namely the likelihood of repetitions
among bitstrings, i.e. $\alpha = \langle \Psi^i \rangle \in (-1, 1)$. If we
consider the set $A_k$ of strings having exactly $k$ entries
equal to $+1$, it is easy to see that



C. Breaking of the network performances: the
ergodic and the random field thresholds

To inspect where ergodicity is restored we can start
from the system of order-parameter equations

q = (tanh^ + \/erz^ 



'-^[W-Pil-qW^"^ 

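and expand them requiring that $m = 0$ at criticality,
while the overlap (being a continuous function undergoing
a second order phase transition) is small, e.g.
$q = \langle \beta^2 \epsilon r z^2 \rangle_z = \epsilon \beta^2 q \langle 1/[W - \beta(1-q)]^2 \rangle_W$.
Then, approximating as usual $\langle f(W) \rangle_W \approx f(\langle W \rangle)$
(annealing), we get the leading term as

$$1 = \frac{\epsilon \beta^2}{(\langle W \rangle - \beta)^2},$$

hence

$$\langle W \rangle = \beta (1 + \sqrt{\epsilon}),$$

which recovers the critical line of the Hopfield model for
$\langle W \rangle = 1$, as it should.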
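As a quick numerical check of this last step, the sketch below verifies that the critical line solves the leading-order condition, assuming the latter reads 1 = εβ²/(⟨W⟩ - β)² with ε > 0. The function names and the values of β and ε are hypothetical.

```python
import math

def critical_mean_degree(beta, eps):
    """<W> on the ergodicity-breaking line: <W> = beta * (1 + sqrt(eps))."""
    return beta * (1.0 + math.sqrt(eps))

def leading_order_residual(w_mean, beta, eps):
    """Residual of 1 = eps * beta**2 / (<W> - beta)**2 (requires <W> != beta)."""
    return 1.0 - eps * beta ** 2 / (w_mean - beta) ** 2

beta, eps = 0.8, 0.05
w_c = critical_mean_degree(beta, eps)
print(w_c, leading_order_residual(w_c, beta, eps))

# Hopfield check: with <W> = 1 the line becomes 1/beta = 1 + sqrt(eps).
beta_hopfield = 1.0 / (1.0 + math.sqrt(eps))
print(critical_mean_degree(beta_hopfield, eps))
```

Setting ⟨W⟩ = 1 and reading 1/β as the noise level reproduces the familiar form 1 + √ε of the Hopfield critical line.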



D. Appendix Three. The mean field biased 
repertoire: Entropic considerations 

A recent pioneering experiment, and its analysis
through the maximum entropy principle, has revealed a highly
non-uniform usage of the genes coding for antibodies in
zebrafish [39, 71]. In particular, it has been shown that
the sequence distribution follows a Zipf law and there
is a massive reduction of diversity, that is, the
repertoire is far from being completely expressed. As we are
going to use a mean-field approximation of this key
result, in this section, through standard information
theory techniques, we highlight the intimate connection
between the size of the antibody's repertoire, its entropy


P{Ak) 



^(l- 



-p)^-*= ~ 2-^i^(p)-^(T)+i1:-p)'^og(j 



(32) 

where $p = P(\Psi^i = 1) = (1 + \alpha)/2$ and $S(p)$ is the entropy
of the probability distribution, defined (with logarithms
in base 2) as

$$S(p) = -p \log p - (1-p) \log(1-p). \qquad (33)$$



In the limit $L \gg 1$, $P(A_k)$ is non-zero only if $k \approx pL$,
thus $A_{pL}$ is the set of typical strings (having full
probability to be drawn). Each typical string $\Psi_{\mathrm{typ}}$ has
the same probability to occur,

$$P(\Psi_{\mathrm{typ}}) \approx p^{pL}(1-p)^{(1-p)L} = 2^{-L S(p)}, \qquad (34)$$

and the number of typical strings, i.e. the size of the
repertoire, is $2^{L S(p)}$. When $\alpha = 0$ the entropy is maximal
and the size of the repertoire is the maximum ($S(1/2) = 1$
and $|B| = 2^L$). On the contrary, if $\alpha \neq 0$, the entropy is
less than 1 and the size of the repertoire sensibly
decreases. In a more realistic scenario in which the entries
are not identically distributed [39], we would have different
bias parameters $\alpha_i$ for each entry, but the result would
be quite the same: as soon as the $\alpha_i$ are different from 0,
the size of the repertoire is $2^{L S(\alpha)} \ll 2^{L S(0)} = 2^L$,
where this time

$$S(\alpha) = \frac{1}{L} \sum_{i=1}^{L} S(\alpha_i). \qquad (35)$$

Since we are interested just in reproducing the size of the
repertoire, we used the simpler mean field approximation
of the latter, where an effective bias parameter $\alpha$ replaces
the whole vector $(\alpha_i)_{i=1}^{L}$.
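The reduction of the repertoire size with the bias can be made concrete with a few lines of code. The sketch assumes base-2 logarithms and P(+1) = (1 + α)/2; the function names and the chosen length L = 100 are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a two-valued variable with P(+1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def log2_repertoire_size(length, alpha):
    """log2 of the number of typical strings: L * S((1 + alpha) / 2)."""
    return length * binary_entropy((1.0 + alpha) / 2.0)

def log2_repertoire_size_heterogeneous(alphas):
    """Same quantity with one bias per entry: the sum of per-entry entropies."""
    return sum(binary_entropy((1.0 + a) / 2.0) for a in alphas)

for alpha in (0.0, 0.3, 0.5, 0.9):
    print(alpha, log2_repertoire_size(100, alpha))
```

For L = 100 an unbiased repertoire spans all 2^100 strings, while α = 0.5 already reduces the exponent to about 81, i.e. a shrinkage of the repertoire by a factor of several hundred thousand.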



E. Appendix Four. Mimicking selection during the
ontogenesis of B cells.



In this section we describe in more detail the simulations
performed to mimic the ontogenesis of B cells and the
related results.

First, we recall that we model the ensemble of
self-molecules as a set $S$ of strings of length $L$, whose
entries are extracted independently via a proper
distribution. The overall number of self-molecules is
$|S| = N_S$, that is, $\mu = 1, \ldots, N_S$.

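We generate $S$ by extracting each string entry $i$ according
to the simplest biased distribution

$$P_{\mathrm{self}}(\Phi_\mu^i \mid a) = \delta(\Phi_\mu^i - 1)\,\frac{1+a}{2} + \delta(\Phi_\mu^i + 1)\,\frac{1-a}{2}, \qquad (36)$$

where $\delta(x)$ is the Kronecker delta and $a \in [-1, 1]$.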

Then, we generate newborn B cells, each represented by an
arbitrary string $\Psi$, and accept them whenever Eq. 15 is
fulfilled. We find that within a wide region of the
parameters $\chi_P$ and $\chi_N$ the resulting final repertoire $B$
exhibits a bias. In order to examine this point more closely
we tackle the problem from an analytical perspective, trying
to corroborate the numerical finding.

We make the following ansatz for the distribution of
string entries

$$P_{\mathrm{rep}}(\Psi_\mu^i \mid \alpha) = \delta(\Psi_\mu^i - 1)\,\frac{1+\alpha}{2} + \delta(\Psi_\mu^i + 1)\,\frac{1-\alpha}{2}, \qquad (37)$$

where $\alpha$ can in principle range in $[-1, 1]$, and we try to
figure out the possible values of $\alpha$ such that all strings
extracted via (37) fulfill (with probability close to 1) the
constraint in Eq. 15. In particular, we aim to figure
out any correlation between the parameter $a$ (assumed as
fixed) and the free parameter $\alpha$. Notice that the choice
of Eq. 37 is consistent with the results presented in
Appendix Three and with our mean-field approach, as it
provides the easiest distribution, possibly admitting a degree
of bias ($\alpha \neq 0$), through which entries are identically and
independently generated.

Now, given the REM-like distribution [55] of
complementarities (1, 2), in order to estimate
$\max_{\Phi \in S}\{\chi(\Psi, \Phi)\}$, as suggested in [53, 54], one can
approximate the extreme value distribution for $\chi(\Psi, \Phi)$
with a Gumbel distribution, whose peak, for large $N_S$, is
located at
$\langle \chi \rangle_{a,\alpha} + \sqrt{2(\langle \chi^2 \rangle_{a,\alpha} - \langle \chi \rangle_{a,\alpha}^2) \log N_S}$,
where $\langle \cdot \rangle_{a,\alpha}$ denotes the average performed over the
distributions $P_{\mathrm{self}}(\Phi_\mu^i \mid a)$ and
$P_{\mathrm{rep}}(\Psi_\mu^i \mid \alpha)$, respectively. Recalling Eqs. 1, 36, 37,
we have



{x)a,a 

{X )a,a ~ {x)a,a 



(1 - aa), (38) 
{l-a^){l--a% (39) 



moreover, as to $N_S$, we can assume the rather general
scaling $N_S \approx (N_B)^c$, with $c > 0$, largely consistent with
immunogenetics measurements [56]; thus, we get

2/ < (1 - aa) + V2c7(l - a^){l - a?) <2f + A, (40) 

where $f = \chi_P / L$ and $\Delta = 2(\chi_N - \chi_P)/L$ is the accessible
gap (it provides a logarithmic measure of the corresponding
allowed binding energies).

In order to fix the value of the parameters, one can
rely on indirect measurements, such as the survival
probability of newborn B cells, which is expected to be
$P_{\mathrm{surv}} = 0.1$ (see Fig. 4, left panel). Moreover, we expect
that $\chi_P > L/2$, since two randomly generated strings
display, on average, $\chi / L = 1/2$, and that $c$ is relatively
small, since the self-repertoire is expected to be sensibly
smaller than the B-repertoire [45, 46, 61, 62].

Having set the parameters according to such constraints,
we tune $a$ and we accordingly derive the values of $\alpha$
which fulfill the inequality (40): these values are
those compatible with the final repertoire. Interestingly,
we find that $a$ and $\alpha$ are correlated: positive values of $a$
yield a biased mature repertoire with $\alpha > 0$.



F. Appendix Five. The robustness of the 
log-normal connectivity distribution for the 
idiotypic network 

Each string $\Psi$ has length $L$ and displays a number $\rho$ of
non-null entries distributed according to the binomial
$P(\rho \mid a, L) = \binom{L}{\rho} [(1+a)/2]^{\rho} [(1-a)/2]^{L-\rho}$. In
the TDL $N_B \to \infty$, the string length is divergent and we
can approximate the previous distribution with a delta
function peaked at the average value $\langle \rho \rangle_a = (1 + a)L/2$.

The observable $\chi_{\mu\nu}$ represents the number of
complementarities between two generic strings
$\Psi_\mu, \Psi_\nu \in B$, defined as



L 

E 

k 



and has the expected values 

(X)a = 



L-1 



(41) 

(42) 
(43) 



over the distribution $B(\rho \mid a, L)$. Notice that, in the TDL,
the variance is vanishing and this distribution also
converges to a delta peaked at $\langle \chi \rangle_a$. Hence, exploiting
the CLT, the stochastic variable $\chi$ can be thought of as
normally distributed, $\mathcal{N}(\langle \chi \rangle_a, \langle \chi \rangle_a^2 / L)$.

From $\chi_{\mu\nu}$ we can define more precisely the coupling
strength $J_{\mu\nu}$ as



$$J_{\mu\nu} = \exp[\alpha \chi_{\mu\nu} L - (1 - \chi_{\mu\nu}) L], \qquad (44)$$



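where positive (complementary matches) and negative
(non-complementary matches) contributions to the coupling
have been highlighted. The term $\exp(\chi)$ is, by
definition, distributed according to the log-normal
distribution $\log\mathcal{N}(\langle \chi \rangle_a, \langle \chi \rangle_a^2 / L)$. With slight
algebraic manipulations, we get that $J$ is distributed
according to
$\log\mathcal{N}(\langle \chi \rangle_a (\alpha+1) L - L, \langle \chi \rangle_a^2 (\alpha+1)^2 L)$,
whose probability distribution is

$$P(J \mid a, \alpha, L) = \frac{1}{J \sqrt{2\pi L}\, \langle \chi \rangle_a (\alpha+1)} \exp\left\{-\frac{[\log J - \langle \chi \rangle_a (\alpha+1) L + L]^2}{2 \langle \chi \rangle_a^2 (\alpha+1)^2 L}\right\}. \qquad (45)$$

Recalling that $L = \gamma \log N_B$, we can write

$$\langle J \rangle_a = N_B^{[\theta^2 + 2\theta - 2]\gamma/2}, \qquad (46)$$

$$\langle J^2 \rangle_a = N_B^{2[\theta^2 + \theta - 1]\gamma}, \qquad (47)$$

being $\theta = \langle \chi \rangle_a (\alpha + 1) > 0$.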

We notice that, by properly tuning $a$ and $\alpha$, one can
recover, in the TDL, different regimes characterized by
different behaviors (finite, vanishing or diverging) for the
average $E(J) = \langle J \rangle_a$ and the variance
$V(J) = \langle J^2 \rangle_a - \langle J \rangle_a^2$, respectively (see Fig. 5).
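The scaling exponents of the first two moments of J can be cross-checked against the moments of a log-normal variable. The sketch assumes that log J is Gaussian with mean θL - L and variance θ²L, with L = γ log N_B and θ = ⟨χ⟩(α + 1); the function names and the numerical values are hypothetical.

```python
import math

def coupling_moment_exponents(theta, gamma):
    """Exponents (e1, e2) such that <J> = N_B**e1 and <J^2> = N_B**e2."""
    e1 = (theta ** 2 + 2.0 * theta - 2.0) * gamma / 2.0
    e2 = 2.0 * (theta ** 2 + theta - 1.0) * gamma
    return e1, e2

def lognormal_moments(theta, gamma, n_b):
    """<J> and <J^2> computed directly from the log-normal parameters."""
    length = gamma * math.log(n_b)   # L = gamma * log N_B
    m = theta * length - length      # mean of log J
    s2 = theta ** 2 * length         # variance of log J
    return math.exp(m + s2 / 2.0), math.exp(2.0 * m + 2.0 * s2)

def behaviour(exponent, tol=1e-12):
    """Large-N_B behaviour of N_B**exponent."""
    if exponent > tol:
        return "diverging"
    if exponent < -tol:
        return "vanishing"
    return "finite"

e1, e2 = coupling_moment_exponents(0.7, 2.0)
print(e1, behaviour(e1), e2, behaviour(e2))
print(lognormal_moments(0.7, 2.0, 1000))
```

With γ = 2 and θ = 0.7 the mean coupling vanishes in the thermodynamic limit while its second moment diverges, which illustrates how a single choice of parameters can place the two quantities in different regimes.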

It is worth stressing that a vanishing $\langle J \rangle_a$ does not
necessarily imply that the emerging topology is
under-percolated. This remains true even assuming the stronger
condition [3]
$J_{\mu\nu} = \Theta[\chi_{\mu\nu}(\alpha+1) - 1] \exp[\chi_{\mu\nu}(\alpha+1)L - L]$,
$\Theta$ being the Heaviside function.

Let us now consider the weighted degree $W$ and its
distribution $P(W \mid a, \alpha, N_B)$. First, we notice that $W$ is
a sum of log-normal variables, pairwise uncorrelated
(as their corresponding receptors are independently
extracted through random VDJ reshuffling [39]). Then, $W$
can be well approximated by a new log-normal random
variable $W = \exp(x)$, where $x$ is a Gaussian random variable
with mean $\mu$ and variance $\sigma^2$. As a result, we expect
$\langle W \rangle_a = \exp(\mu + \sigma^2/2)$ and
$\langle W^2 \rangle_a = \exp(2\mu + 2\sigma^2)$. Moreover, we can write
$\langle W \rangle_a \approx N_B \langle J \rangle_a$ and
$\langle W^2 \rangle_a - \langle W \rangle_a^2 \approx N_B(\langle J^2 \rangle_a - \langle J \rangle_a^2)$,
in agreement with Bienaymé's theorem. Now we can use the
previous expressions to fix $\mu$ and $\sigma^2$, recovering the
Fenton-Wilkinson method for approximating log-normal sums,
where $E(J) = \langle J \rangle_a = N_B^{[\theta^2 + 2\theta - 2]\gamma/2}$ and
$\langle J^2 \rangle_a = N_B^{2[\theta^2 + \theta - 1]\gamma}$, being
$\theta = (1 - a^2)(\alpha + 1)/2 > 0$. Consequently, by properly
tuning $a$ and $\alpha$, one can recover, in the thermodynamic
limit, different regimes characterized by different
behaviors (finite, vanishing or diverging) for the average
$\langle J \rangle_a$ and the variance
$V(J) = \langle J^2 \rangle_a - \langle J \rangle_a^2$, respectively, as reported
in Fig. 5.

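In particular, we can write

$$\mu = \log\left[\frac{N_B \langle J \rangle_a^2}{\sqrt{\langle J \rangle_a^2 + (\langle J^2 \rangle_a - \langle J \rangle_a^2)/N_B}}\right], \qquad (48)$$

$$\sigma^2 = \log\left[\frac{\langle J^2 \rangle_a - \langle J \rangle_a^2 + N_B \langle J \rangle_a^2}{N_B \langle J \rangle_a^2}\right], \qquad (49)$$

through which we get the following distribution for $W$,
to be taken also as an approximation for $P(W \mid a, \alpha, N_B)$:

$$P(W) = \frac{1}{W \sqrt{2\pi}\,\sigma} \exp\left[-\frac{(\log W - \mu)^2}{2\sigma^2}\right]. \qquad (50)$$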



These results are corroborated by numerical data.
Therefore, $W$ is characterized by a mean and a variance
which may assume a vanishing, finite, or diverging
value according to the values of $a$ and $\alpha$, similarly to
what was found for $J$.
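The moment matching behind the log-normal approximation of W can be sketched in a few lines. The code assumes that W is a sum of N_B uncorrelated couplings with known first and second moments, and that the log-normal parameters are fixed by matching the mean and the variance of the sum (the Fenton-Wilkinson prescription); the function names and the numerical values are hypothetical.

```python
import math

def fenton_wilkinson(mean_j, mean_j2, n):
    """Log-normal parameters (mu, sigma2) matching mean and variance of a sum
    of n uncorrelated terms with moments <J> = mean_j and <J^2> = mean_j2."""
    mean_w = n * mean_j                      # additivity of the means
    var_w = n * (mean_j2 - mean_j ** 2)      # Bienayme: variances add up
    sigma2 = math.log(1.0 + var_w / mean_w ** 2)
    mu = math.log(mean_w) - 0.5 * sigma2
    return mu, sigma2

def lognormal_mean_var(mu, sigma2):
    """Mean and variance of exp(x) with x Gaussian of mean mu, variance sigma2."""
    mean = math.exp(mu + sigma2 / 2.0)
    var = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    return mean, var

# Illustrative moments (assumption): <J> = 0.5, <J^2> = 1.0, N_B = 100.
mu, sigma2 = fenton_wilkinson(0.5, 1.0, 100)
print(mu, sigma2, lognormal_mean_var(mu, sigma2))
```

Rewriting the two returned quantities in terms of the moments of J gives back the closed forms for μ and σ² quoted in this appendix, which is a convenient consistency check when the moments span many orders of magnitude.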